AD-A181 026  SHOULD THE AIR FORCE PERSONNEL DATA SYSTEM USE DATABASE
MACHINES? (U) AIR COMMAND AND STAFF COLL MAXWELL AFB AL
M T ANDERSON APR 87 ACSC-87-0110




UNCLASSIFIED 


F/G  12/7 




AD-A181  026 



AIR  COMMAND 


AND 

STAFF  COLLEGE 


- STUDENT  REPORT - 

SHOULD  THE  AIR  FORCE  PERSONNEL 
DATA  SYSTEM  USE  DATABASE  MACHINES? 

MAJOR  MICHAEL  T.  ANDERSON  87-0110 

- "insights into tomorrow" -


DISCLAIMER 


The  views  and  conclusions  expressed  in  this 
document  are  those  of  the  author.  They  are 
not  intended  and  should  not  be  thought  to 
represent  official  ideas,  attitudes,  or 
policies  of  any  agency  of  the  United  States 
Government.  The  author  has  not  had  special 
access  to  official  information  or  ideas  and 
has  employed  only  open-source  material 
available  to  any  writer  on  this  subject. 

This  document  is  the  property  of  the  United 
States  Government.  It  is  available  for 
distribution  to  the  general  public.  A  loan 
copy  of  the  document  may  be  obtained  from  the 
Air  University  Interlibrary  Loan  Service 
(AUL/LDEX,  Maxwell  AFB,  Alabama,  36112)  or  the 
Defense  Technical  Information  Center.  Request 
must  include  the  author's  name  and  complete 
title  of  the  study. 

This  document  may  be  reproduced  for  use  in 
other  research  reports  or  educational  pursuits 
contingent  upon  the  following  stipulations: 

—  Reproduction  rights  do  not  extend  to 
any  copyrighted  material  that  may  be  contained 
in  the  research  report. 

—  All  reproduced  copies  must  contain  the 
following credit line: "Reprinted by
permission of the Air Command and Staff
College."


—  All  reproduced  copies  must  contain  the 
name(s)  of  the  report's  author(s). 

—  If  format  modification  is  necessary  to 
better  serve  the  user's  needs,  adjustments  may 
be  made  to  this  report — this  authorization 
does  not  extend  to  copyrighted  information  or 
material. The following statement must
accompany the modified document: "Adapted
from Air Command and Staff Research Report
(number) entitled (title) by (author)."

—  This  notice  must  be  included  with  any 
reproduced  or  adapted  portions  of  this 
document.




REPORT  NUMBER  87-0110 

TITLE  SHOULD THE AIR FORCE PERSONNEL DATA SYSTEM USE DATABASE MACHINES?

AUTHOR(S)  MAJOR MICHAEL T. ANDERSON, USAF

FACULTY ADVISOR  MAJOR TERRY L. BROOKS, ACSC 3023 STUS

SPONSOR  DAVID ESSENPREIS, AFMPC/DPMDX


Submitted  to  the  faculty  in  partial  fulfillment  of 
requirements  for  graduation. 


AIR  COMMAND  AND  STAFF  COLLEGE 
AIR  UNIVERSITY 
MAXWELL  AFB,  AL  36112 




REPORT DOCUMENTATION PAGE

1a. REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
3. DISTRIBUTION/AVAILABILITY OF REPORT: STATEMENT "A", Approved for public release
4. PERFORMING ORGANIZATION REPORT NUMBER(S): 87-0110
6a. NAME OF PERFORMING ORGANIZATION: ACSC/EDCC
6c. ADDRESS (City, State and ZIP Code): Maxwell AFB AL 36112-5542
11. TITLE (Include Security Classification): SHOULD THE AIR FORCE PERSONNEL
12. PERSONAL AUTHOR(S): Anderson, Michael T., Major, USAF
16. SUPPLEMENTARY NOTATION: ITEM 11: DATA SYSTEM USE DATA BASE MACHINES?

19. ABSTRACT (Continue on reverse if necessary and identify by block number)

Database management systems are in wide use today for large automated information
systems. Special-purpose computers, called database machines, are becoming commercially
available. These computers are tailored to manipulate structured databases very
efficiently. This study examines the suitability of one such machine for use in the
Air Force Personnel Data System (PDS). The study concludes that the machine, the
Teradata DBC/1012, has the potential to improve the performance of the PDS, and should
be considered for purchase.


20. DISTRIBUTION/AVAILABILITY OF ABSTRACT: UNCLASSIFIED/UNLIMITED / SAME AS RPT. / DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION: UNCLASSIFIED
22a. NAME OF RESPONSIBLE INDIVIDUAL: ACSC/EDCC Maxwell AFB AL 36112-5542

DD FORM 1473, 83 APR. EDITION OF 1 JAN 73 IS OBSOLETE.  UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE


The past twenty years have seen significant improvements in the
computer technology available for managing large amounts of information,
progressing from simple file systems in the 1960s to the integrated
database management systems of today. For some time, researchers have
been studying special-purpose computers designed for the efficient
manipulation of large databases. Such machines are now becoming
commercially available. This study is an initial look at the advantages
of using one of these "database machines" to improve the performance of
the Air Force Personnel Data System.


ABOUT  THE  AUTHOR 


Major Michael T. Anderson has had a varied Air Force career. He
enlisted in 1972, immediately after graduating from college, and served
for two years as an administration specialist in an early-warning radar
detachment. He was commissioned through Officer Training School in
1974. He was assigned to the NORAD Cheyenne Mountain Complex as a space
systems officer, where he worked in the missile warning center until
1976, when he was transferred to Clear AFS, Alaska, to operate the
Ballistic Missile Early Warning System radar. In 1977, he moved to
Headquarters, Strategic Air Command, and cross-trained into the computer
systems career field. While at Hq SAC, Major Anderson developed
several new intelligence production methods for the SAC Intelligence
Data Handling System. After some time out for school assignments (AFIT
Master's program and Squadron Officer School), he was assigned to the
Air Force Military Personnel Center (AFMPC) as a personnel data systems
officer until August 1986. For the last three years of his tour at
AFMPC, Major Anderson served as the chief database administrator and
designer for the Air Force Personnel Data System (PDS), and was
instrumental in the successful migration of the headquarters-level PDS
between two different makes of computers.

Major Anderson holds a Bachelor of Science in Physics from the
University of California, Riverside; a Master of Science in Computer
Engineering from Stanford University; and is a distinguished graduate of
the Air Force Squadron Officer School. In addition to his military
experience and academic education, he has taught computer science for
several years at both the graduate and undergraduate levels, at St.
Mary's University, San Antonio, Texas.




TABLE OF CONTENTS


CHAPTER 1 - THE PROBLEM

Workload of the Air Force Personnel Data System

The Shortfalls of REACQ

A Possible Solution

CHAPTER 2 - THE AIR FORCE PERSONNEL DATA SYSTEM

Missions

System Description

CHAPTER 3 - TECHNOLOGICAL BACKGROUND

Introduction

Database Management Systems

Database Machines

CHAPTER 4 - COMPARISON OF DATABASE MACHINE AGAINST A CONVENTIONAL
DATABASE MANAGEMENT SYSTEM

Selecting a Database Machine for the PDS

Qualitative Factors

Quantitative Factors

CHAPTER 5 - CONCLUSIONS, FINDINGS, AND RECOMMENDATIONS

Introduction

Conclusions

Findings

Recommendations

BIBLIOGRAPHY


LIST OF ILLUSTRATIONS


TABLES


TABLE 1--Commercial Database Machines

TABLE 2--AA Performance Estimates

TABLE 3--Prices of Teradata DBC/1012 Systems


FIGURES


FIGURE 1--Hierarchical Database Model

FIGURE 2--Network Database Model

FIGURE 3--Relational Database Model

FIGURE 4--Teradata DBC/1012 Configuration

FIGURE 5--Teradata DBC/1012 Performance Curves




EXECUTIVE  SUMMARY 


Part  of  our  College  mission  is  distribution  of  the 
students’  problem  solving  products  to  DoD 
sponsors  and  other  interested  agencies  to 
enhance  insight  into  contemporary,  defense 
related  issues.  While  the  College  has  accepted  this 
product  as  meeting  academic  requirements  for 
graduation,  the  views  and  opinions  expressed  or 
implied  are  solely  those  of  the  author  and  should 
not  be  construed  as  carrying  official  sanction. 


"insights into tomorrow"


REPORT NUMBER  87-0110

AUTHOR(S)  MAJOR MICHAEL T. ANDERSON, USAF

TITLE  SHOULD THE AIR FORCE PERSONNEL DATA SYSTEM USE DATABASE MACHINES?


I. Purpose: To determine whether the performance, cost, and
reliability of the Air Force Personnel Data System can be significantly
improved by installing a commercially available special-purpose computer
dedicated to database management tasks (a "database machine").

II. Problem: The PDS has, for years, been constrained in the services
it can provide, primarily due to performance limitations imposed by its
computer systems. The hardware reacquisition project (REACQ) of 1983-85
was an attempt to deal with this problem, but was only partially
successful. A new approach to data processing may offer a breakthrough
in the performance level of the PDS, thereby making personnel data
services cheaper, more reliable, and more available.

III. Data: The PDS includes a wide range of applications, ranging from
traditional record-keeping to recruiting to force structure modeling to
formal training management, and much more. Utilization statistics show,
however, that nearly 80% of the entire central-site computer resource is
dedicated to only a few general categories: the master personnel files,
and a few very active online information systems (PMS, PROMIS, and
SURF). These systems are very much "input-output intensive," spending
most of their time reading or writing data to disk devices.

Furthermore, all of these systems use a database management system
(DBMS) to handle their I/O operations. Any effort to improve the
performance of the PDS must address these systems, and must certainly
address the performance of the underlying DBMS. Traditional database
architectures have been tailored for existing mainframe computers in
what has been described as a "machine-friendly" design. These
architectures are geared for "record-at-a-time" processing, and use
intricate chain or tree structures to speed up the process of finding
data. Recently, the more "user-friendly" relational database
architecture has become commercially available in packages which are
reasonably efficient for small-to-medium sized applications, but this
"elegant" approach to database management has been too computationally
intensive to be successful for large information systems with many
users. Special-purpose computers, called "database computers" or
"database machines," have recently appeared on the commercial market.
These computers offer the capability of applying sufficient
computational power to the needs of a relational database architecture
to make its performance comparable to, or even better than, traditional
database management systems. As of this writing, only one commercially
available database machine has the storage capacity and hardware
compatibility necessary for the PDS; that machine is the Teradata
DBC/1012. A case-study comparison of potential performance, rating the
DBC/1012 against the current Honeywell DPS-8/70 DM-IV system, shows that
the DBC/1012 offers several distinct advantages. The DBC/1012 has the
potential for improved performance, simpler backup and recovery
operations, less susceptibility to failure, and a greater capacity for
growth.

IV. Conclusions: The Teradata DBC/1012 database machine compares
favorably with the mainframe DBMS operating at AFMPC. It offers
improvements in system performance, ease of operation, reliability, and
ease of software development.

V. Recommendation: The Director of Personnel Data Systems, Air Force
Military Personnel Center, should begin action to develop prototype
relational database software for the purpose of benchmarking the
Teradata DBC/1012 database computer. If the benchmark demonstrates
a significant performance improvement over the existing mainframe DBMS,
the Director should take action to procure a DBC/1012 large enough to
support the database portion of the PDS.




Chapter  1 


THE  PROBLEM 


WORKLOAD  OF  THE  AIR  FORCE  PERSONNEL  DATA  SYSTEM 


The US Air Force Personnel Data System (PDS), operated from the
central site at the Air Force Military Personnel Center (AFMPC),
Randolph AFB, Texas, is the largest personnel data system in the federal
government. It operates 24 hours a day, seven days a week, managing the
personnel records of over one million active duty, Air Reserve Forces,
and civilian personnel. It supports over 150 Air Force bases worldwide,
as well as some fifty major air commands, separate operating agencies,
and intermediate headquarters. Since the mid-1970s, the PDS has
suffered capacity problems, forcing the limitation or curtailment of
certain data processing services. In the late 1970s, an initiative to
expand the capacity of the PDS was begun, which would improve the system
through the purchase of new computer hardware. In the fall of 1982, the
Honeywell Corporation was awarded the contract for the AFMPC
Reacquisition Project (REACQ). The goal of REACQ was to replace the
existing Burroughs 6700 computer complex with newer, high-performance
mainframe computers to improve the overall performance and capacity of
the PDS.


THE SHORTFALLS OF REACQ

The new computer hardware purchased under REACQ required that most
of the software for the PDS be modified to operate on the new computers;
this modification was completed in the fall of 1985. In the three years
since the initial contract award, the nature and extent of much of the
central site software for the PDS had changed, but the fundamental
mission and modes of processing remained basically the same. Some data
processing applications had grown significantly, or had been greatly
improved; others became obsolete or were radically altered. Overall, the
software supporting the PDS was adapted to fit the new hardware and
software environment without any serious deficiencies or loss of
capability. However, one major problem remains. As of today, the PDS
is again facing a capacity problem, similar to that of the mid-1970s.
While the replacement of computers did provide the PDS with new
capacity, the modification of old software, and the development of new,
has absorbed much of that added capacity. Today, the average user of
the PDS in the major air command or Air Staff office sees little
improvement in the overall performance of the PDS as compared to ten
years ago.

A  POSSIBLE  SOLUTION 

Given this somewhat dreary history, is the PDS doomed to continuing
cycles of "playing catch-up," attempting to keep computer capacity ahead
of workload? Traditional computer system capacity planning methods
might lead to that scenario. Unfortunately, the Air Force has neither
the time nor money to continue operating in this fashion. Fortunately,
the advent of several new hardware and software technologies gives us the
capability to break out of the "catch-up" cycle. One of the promising
technologies of the 1980s is the backend database machine. This is a
special-purpose computer designed to extend the capacity of a
general-purpose computer system and prolong its useful lifetime by
adding processing capacity without the need for a complete overhaul of
software and a complete replacement of computer hardware at a data
processing installation. In this paper I will attempt to explain the
advantages of a database machine, and evaluate its usefulness in the
operation of the PDS. Based on this evaluation, I will answer the
question "Should a database machine be used as an integral part of the
Air Force Personnel Data System?"


Chapter 2


THE  AIR  FORCE  PERSONNEL  DATA  SYSTEM 


MISSIONS 

The Air Force Military Personnel Center is tasked with operating
the overall personnel system for all members of the US Air Force. To do
this, the center is authorized to

. . . develop and implement policy concerning accessions
testing, classification, worldwide distribution and management
of personnel, automated personnel systems, military personnel
records systems, standard personnel operations, programs,
officer and airman performance evaluation, promotion testing,
reenlistment and retention, leave, survivors benefits, escort
and dependent travel, awards and decorations, appearance
standards, nonappropriated fund manpower requirements, Morale,
Welfare, and Recreation (MWR) activities, active duty service
commitments, specified period of time contracts, and overseas
tour lengths. Assist in the development and implementation of
policies pertaining to procurement . . . promotion of officer
and enlisted members, demotion of enlisted members, desertion,
absent without leave (AWOL), Regular-Reserve-temporary Air
Force appointments, separations, retirements, flying status,
service dates, Indefinite Reserve Status, and Social Actions
programs. (16:1)

A key tool in operating a personnel system dealing with over 1 million
people (6:190) is an extensive computer system. AFR 23-33 gives AFMPC
the authority for operating, scheduling, and maintaining the central
site facility for the Personnel Data Systems computers and peripheral
equipment to support the Air Staff, AFMPC, USAFR, ANG, MAJCOMs, separate
operating agencies, and base-level consolidated base personnel offices
(16:2).


SYSTEM  DESCRIPTION 

The Air Force Personnel Data System is a collection of computer
systems located at each Air Force base, Headquarters US Air Force, and
at the Air Force Military Personnel Center at Randolph Air Force Base,
Texas. These systems are interconnected with a variety of
communications systems, including high-speed data links, low-speed
telephone lines, and AUTODIN message connections. The hub of the entire
personnel data processing effort is the central site at the Military
Personnel Center (MPC). The central site at MPC is responsible for the
maintenance of a single centralized repository of data used by the
entire Air Force. A "master record" is kept for every individual in the
active and reserve forces in the master personnel files. The
schedules, class rosters, and management information for all Air Force
formal training are maintained in the Pipeline Management System (PMS).
New Air Force recruits are matched against job requirements and training
schedules using the Procurement Management Information System, or
PROMIS. The specialized information for airman and officer promotions
is kept up-to-date and is used by promotion boards and the Weighted
Airman Promotion System. Finally, several other smaller collections of
data are maintained for such diverse areas as force structure, officer
assignments, Congressional inquiries, and the Air Force Suggestion
Program. (14:1-2 - 1-7)


Chapter  3 

TECHNOLOGICAL BACKGROUND

INTRODUCTION 


Database management systems (DBMSs) are complex combinations of
computer hardware and software. Originally, information on a computer
system was simply managed using the vendor-supplied file system, part of
the operating system furnished with the computer. Over the years,
however, specialized software and hardware have been developed to handle
large databases, for a variety of reasons. These range from efforts to
increase programmer productivity to providing built-in protection for
data checking and recovery. Initially, the emphasis was placed on
specialized software, running on general-purpose computers. More
recently, special-purpose computers have been designed to use
"streamlined" versions of this database management software.

DATABASE  MANAGEMENT  SYSTEMS 

Database management systems have "come into their own" in the last
decade, primarily because of reliability and ease of use. Prior to the
widespread use of DBMSs in commercial and government applications,
programmers used their own "homegrown" file structures. They often
spent as much time working on their file systems as they did
developing and maintaining applications software (1:12). The first
DBMSs were systems developed by vendors without any real theoretical
basis. Of these, only International Business Machines' Information
Management System (IMS) remains in successful commercial operation.
This can be attributed as much to the large number of installations
with the software installed as it can to the efficiencies of the system
(1:503).

Hierarchical Database Management Systems


Initially, the database systems were hierarchical file systems.
In these systems, information is represented in a group of tree-like
structures. For each related group of objects, there is one "parent"
object at the root of the group (in hierarchical database jargon, the
record). All objects (called segments) are connected logically and
physically in parent-child relationships. Most importantly, every
segment except the root segment can only be reached after finding all
of its parent segments (reference Figure 1). Applications requirements
rapidly outstripped the capability of this "pure" hierarchical data
structure, and vendors responded to user requirements with additional
data structures to simplify and speed up access to data in hierarchical
databases. Today IMS has the ability to partition database records so
that applications may reach certain intermediate "roots" directly, and
index structures allow preselected types of child segments to be located
based on the value of one or more data items within that segment
(1:529-533). Other vendors' hierarchical systems were similarly
modified; Burroughs Corporation's DMS-II uses treelike and bit-mapped
indexing structures and linked lists, and allows programmer-defined
pointers to link objects in the database in arbitrary patterns
(21:4-85, 4-137 - 4-138).
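
The access rule described above, that a segment is reached only by way of its parents, can be illustrated with a minimal Python sketch. The `Segment` class, the `find` function, and the sample course keys are hypothetical and are not drawn from IMS or DMS-II; the sketch only shows why every lookup in a "pure" hierarchy starts at the root.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One node of a hierarchical record (hypothetical structure)."""
    kind: str
    key: str
    children: list = field(default_factory=list)


def find(root, path):
    """Walk down from the root, one (kind, key) pair per level.

    Returns the segment found (or None) and the number of segments
    examined, counting the root itself.
    """
    visited = 1
    current = root
    for kind, key in path:
        match = None
        for child in current.children:
            visited += 1
            if child.kind == kind and child.key == key:
                match = child
                break
        if match is None:
            return None, visited
        current = match
    return current, visited


# Illustrative data only: one DEPARTMENT record with two COURSE children.
department = Segment("DEPARTMENT", "Physics", [
    Segment("COURSE", "PHYS101", [Segment("OFFERING", "F86")]),
    Segment("COURSE", "PHYS201", [Segment("OFFERING", "S87")]),
])

offering, visited = find(department, [("COURSE", "PHYS201"), ("OFFERING", "S87")])
missing, _ = find(department, [("COURSE", "PHYS999")])
```

There is no way in this sketch to ask for an OFFERING directly; the indexes and intermediate "roots" that vendors later added were shortcuts around exactly this restriction.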

Network  Database  Management  Systems 


By the late 1960s, vendors had so modified and diversified their
database management systems that the art of software development using
database systems was becoming chaotic. The gains in portability and
maintainability made with the standardization of most business software
around the Common Business-Oriented Language (COBOL) standard were being
lost in the growing confusion of differing database management systems.
The American National Standards Institute, a joint commercial and
governmental standards group, had within it the Committee on Data Systems
and Languages (CODASYL). CODASYL had been successful in the past in
developing standards for COBOL and other programming languages. In
1969, CODASYL formed the Data Base Task Group (DBTG), which set out to
develop a single, agreed-upon industry standard for database management
systems. The US government, academia, and major computer vendors were
all represented on the group. They developed a specification for a new
type of database management system, one which was a significant
enhancement of the capabilities of the existing hierarchical systems
(13:215-216).

The new standard, the network database, had several new features.
While data could still be represented in hierarchies, objects in a
database no longer were restricted to a single type of parent object (in
DBTG jargon, an owner record). Many different structures could exist in
the same database, and access to various types of objects could be
improved by specifying the general method used to find and store each
different type of record in a database (reference Figure 2). The
difference in database structure is significant since several fast
access paths could now be built into a database instead of only one. In
addition, maintenance of the database was improved by completely
separating, for the first time, the logical specification (what record
types there were, and how they were connected) from the physical
specification (how large disk files would be, how many records could fit
on a database, and so on) (1:541-548).
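
As a rough illustration of the multiple-owner idea, the Python sketch below lets one member record belong to two named sets with different owners, so it can be reached along either path. The `NetworkDB` class is hypothetical and is not the DBTG data manipulation language; the set names are borrowed from Figure 2, but which owner goes with which set is an assumption made here for illustration.

```python
from collections import defaultdict


class NetworkDB:
    """Toy store of owner-to-member sets (a hypothetical structure)."""

    def __init__(self):
        # set name -> owner key -> list of member records
        self.sets = defaultdict(lambda: defaultdict(list))

    def connect(self, set_name, owner_key, member):
        """Attach a member record to one owner within a named set."""
        self.sets[set_name][owner_key].append(member)

    def members(self, set_name, owner_key):
        """Return the member records owned by owner_key in set_name."""
        return list(self.sets[set_name][owner_key])


db = NetworkDB()

# Illustrative data only: one ENROLLMENT record with two owners.
enrollment = {"student": "S1", "offering": "PHYS101-F86", "grade": "A"}
db.connect("CLASS-ROSTER", "PHYS101-F86", enrollment)   # owned by an OFFERING
db.connect("CLASS-SCHEDULE", "S1", enrollment)          # owned by a STUDENT

by_offering = db.members("CLASS-ROSTER", "PHYS101-F86")
by_student = db.members("CLASS-SCHEDULE", "S1")
```

Both lookups return the same record object, which is the point of the network model: two fast access paths to one piece of data, where a pure hierarchy allows only one.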


[Figure 2: structure diagram and occurrence diagram of a network
database. Record types and data items: DEPARTMENT (DEPTNAME, CHAIRMAN),
COURSE (COURSE#, TITLE, CREDITS), OFFERING (OFF#, SEMESTER, LOCATION,
INSTRUCTOR), STUDENT (STUDENTNAME, STUDENT#), ENROLLMENT (GRADE). Set
names: CATALOG, SCHEDULE, DEPT-MAJORS, CLASS-ROSTER, CLASS-SCHEDULE.
The legend marks data items and direct access paths. The occurrence
diagram shows a DEPARTMENT record (Physics, Heisenberg) with its COURSE
and STUDENT records.]

FIGURE 2. NETWORK DATABASE (1:546)


Like the hierarchical systems before it, the "pure" network model
has been modified by various vendors to enhance its utility and speed.
Today, most network systems (notably Honeywell Corporation's IDS2 and
Cullinet's IDMS) offer treelike indexes and pointer arrays to augment
data structuring and improve performance (1:561-568).

The  Relational  Database  Model 

Even as network database management systems were being developed for
commercial use in the early 1970s, their replacement was being created.
In 1970, E.F. Codd published his seminal paper on relational database
theory (7:--). Codd felt that databases were entirely too ad hoc, and
sought to develop a theoretical foundation which would furnish useful
tools to database designers, programmers, and users, without being
needlessly complicated. Using the concept of the relation, a simple
table of data, he demonstrated that this structure could, in theory,
satisfy the logical requirements of any database application. Further
work by Date at IBM San Jose and his now classic book on database
systems (1:--) popularized the concept.

The relational approach abandons the idea of elaborate structures
for data in favor of the simple tabular representation. Every "object"
in the database is simply a "line entry" or row in one of many tables or
relations comprising the database. If the rows in different tables are
logically related, they are linked together with matching data values in
corresponding fields, called columns (reference Figure 3). The lack of
complicated physical data structure is matched with a very simple
programming language; this combination makes relational database
management systems much easier to learn and use, for both programmers
and users (14:217).
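
The linking of rows by matching column values can be seen in a small sketch using Python's standard `sqlite3` module. SQLite is only a convenient stand-in for a relational system and is not one of the products discussed in this report; the table names, column names, and rows are illustrative assumptions.

```python
import sqlite3

# An in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (deptname TEXT PRIMARY KEY, chairman TEXT);
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        title     TEXT,
        credits   INTEGER,
        deptname  TEXT            -- matches department.deptname
    );
    INSERT INTO department VALUES ('Physics', 'Heisenberg');
    INSERT INTO course VALUES ('PHYS101', 'Mechanics', 3, 'Physics');
    INSERT INTO course VALUES ('PHYS201', 'Quantum Theory', 4, 'Physics');
""")

# No pointers or owner records: the two tables are related only by the
# equal values in their deptname columns.
rows = conn.execute(
    """
    SELECT c.course_no, c.title, d.chairman
    FROM course AS c
    JOIN department AS d ON c.deptname = d.deptname
    WHERE d.deptname = ?
    ORDER BY c.course_no
    """,
    ("Physics",),
).fetchall()
conn.close()
```

The query states what is wanted rather than how to navigate to it, which is the ease-of-use advantage the paragraph above describes; the cost is that the system, not the programmer, must find an efficient way to match the rows.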


The relational database architecture is attractive for several
reasons. Simplicity of use and operation is the major reason, but
certainly not the only one. The ready availability of third-party
applications and productivity tools is another big benefit (9:144).
Very recently, the advent of high-performance relational systems, such
as IBM's DB2, has made them attractive to "top end" data processors for
whom sheer power was an overwhelming requirement (8:--).

DATABASE MACHINES


Development

As database software evolved in the late 1970s, it became apparent
that traditional computer architectures were ill-suited for database
management tasks. The traditional, or "von Neumann," computer
architecture is primarily an arithmetic processor, operating on a single
"word" of data at a time (usually a number, hence the term arithmetic
logic unit, or ALU). Database management, on the other hand, often
requires that many "words" of data be examined at a time, often in
complex ways (3:1-2). For example: How many rated captains with more
than six years of commissioned service currently have less than two
years on station? What are their dependents' names? At the same time,
new concepts in computer hardware were beginning to demonstrate
significant performance improvements in arithmetic processing.
Pipelined architectures are faster because computer instructions are
executed in an "assembly line" fashion: different parts of different
instructions are executed simultaneously (4:1145-1148). Parallel
architectures are faster because several complete computer instructions
are executed simultaneously (4:1100-1104). Researchers began looking at
the idea of specially configured or designed computers for database
management.
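
The example question in the preceding paragraph can be made concrete
with a small relational sketch. The following uses Python's built-in
sqlite3 module purely as an illustration; the table names (officers,
dependents), the column names, and the sample rows are all hypothetical
and are not taken from the actual PDS data structures.

```python
import sqlite3

# Hypothetical selection criteria matching the question in the text:
# rated captains, more than six years commissioned, under two on station.
CRITERIA = (
    "grade = 'CAPT' AND rated = 1 "
    "AND years_commissioned > 6 AND years_on_station < 2"
)


def build_demo_db():
    """Create an in-memory database with two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE officers (
            officer_id INTEGER PRIMARY KEY,
            grade TEXT,
            rated INTEGER,
            years_commissioned REAL,
            years_on_station REAL
        );
        CREATE TABLE dependents (
            officer_id INTEGER REFERENCES officers(officer_id),
            name TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO officers VALUES (?, ?, ?, ?, ?)",
        [
            (1, "CAPT", 1, 7.5, 1.0),   # matches every criterion
            (2, "CAPT", 1, 8.0, 3.0),   # too long on station
            (3, "CAPT", 0, 9.0, 0.5),   # not rated
            (4, "MAJ", 1, 12.0, 1.5),   # not a captain
        ],
    )
    conn.executemany(
        "INSERT INTO dependents VALUES (?, ?)",
        [(1, "Alex"), (1, "Jordan"), (2, "Casey"), (3, "Riley")],
    )
    return conn


def count_matching(conn):
    """How many officers satisfy the criteria?"""
    query = f"SELECT COUNT(*) FROM officers WHERE {CRITERIA}"
    return conn.execute(query).fetchone()[0]


def dependents_of_matching(conn):
    """Names of dependents, found by matching officer_id across tables."""
    query = (
        "SELECT d.name FROM dependents d "
        "JOIN officers o ON o.officer_id = d.officer_id "
        f"WHERE {CRITERIA} ORDER BY d.name"
    )
    return [row[0] for row in conn.execute(query)]
```

Both questions require every officer row to be examined, and the second
also requires rows in two tables to be linked by a matching data value.
This is the kind of many-records-at-once work described above.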


In both approaches, the database services were removed from a
centralized computer and placed on a "backend" processor. This backend
processor communicates with the mainframe using an agreed-upon set of
messages transmitted over a high-speed data link. Whenever an
applications program needs to use a database, instead of directly
invoking the DBMS software, it simply sends a message to the backend
processor. The backend, then, receives the message, performs the
requested function, and sends an acknowledgement or answer back to the
waiting application program on the mainframe (3:13-14).
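
The request-and-reply pattern described above can be sketched in a few
lines. This is a minimal illustration only: the message layout (an "op"
field plus parameters, encoded as JSON), the class name
BackendProcessor, and the function host_request are assumptions made
for the example, not the message set of any actual database machine.

```python
import json


class BackendProcessor:
    """Stands in for the backend: it owns the data and answers messages."""

    def __init__(self, records):
        self._records = records

    def handle(self, raw_message: bytes) -> bytes:
        """Receive a message, perform the function, return the answer."""
        request = json.loads(raw_message)
        if request["op"] == "count":
            field, value = request["field"], request["value"]
            answer = sum(1 for r in self._records if r.get(field) == value)
            reply = {"status": "ok", "answer": answer}
        else:
            # Unknown operations are acknowledged with an error status.
            reply = {"status": "error", "answer": None}
        return json.dumps(reply).encode()


def host_request(link, op, **params):
    """Application side: build a message, send it, wait for the reply.

    `link` is any callable that carries bytes to the backend and returns
    the reply bytes; here it plays the role of the data link.
    """
    message = json.dumps({"op": op, **params}).encode()
    return json.loads(link(message))


backend = BackendProcessor(
    [{"grade": "CAPT"}, {"grade": "MAJ"}, {"grade": "CAPT"}]
)
reply = host_request(backend.handle, "count", field="grade", value="CAPT")
```

The application never touches the records directly. It sees only the
reply, so the backend can be replaced or expanded without changing the
application's side of the exchange.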


The  Different  Approaches 

Efforts fell into two main categories, the software and hardware
approaches. Software backends are standard, "off-the-shelf" computer
systems running specially "tuned" database management and operating
system software to deliver high performance. Examples on the
commercial market are the Bell Laboratories XDMS system and
Britton-Lee's Intelligent Data Management System (IDMS). Other
developers felt specialized hardware was necessary to achieve high
performance; this approach has led to the hardware backends. These
computers typically used either parallel or pipelined architectures to
allow the computer to process several records or parts of records at the
same time, thereby achieving greatly improved performance through
parallelism. Examples on today's commercial market include Intel
Corporation's Intelligent Database Processor (iDBP [sic]) and Teradata's
Database Computer (DBC) (3:318-319).

Because of the simplicity of the relational data model, most
database machines use it as their underlying data model (reference Table
1). However, the hierarchical and network models are also supported by
Intel Corporation's iDBP (12:52-53).


Advantages  of  Database  Machines 

The database machine approach offers several distinct advantages
over conventional general-purpose computers. These advantages fall into
three main categories: ease of expansion, improved performance, and low
cost.

For some installations, it is much easier to expand the capacity of
the computer system by adding a database machine than to replace the
entire computer. This expansion can have a dramatic effect, as in the
case of one installation using the Teradata DBC/1012, which saw over 99%
of its data manipulation workload moved from the mainframe computer to
the backend database machine (10:54). Along with this capacity
expansion can come a significant performance improvement.

Database machines are specialized processors, providing very fast
database operations. Many system developers like the flexibility of the
relational database model, but dislike the poor performance of these
same systems. By offering a machine designed around the relational data
model, vendors have made relational databases efficient to use (9:139).


Perhaps most important, database machines are relatively cheap.
Processing power on a database machine costs on the order of one-tenth
to one-fourth as much as the same power on a mainframe computer.
Teradata Corporation compares one of its larger systems, a 60-processor
version of the DBC/1012, priced at $1.7 million, to the IBM 3084Q,
costing $6.2 million (10:63).


Disadvantages 

In spite of all the excitement about database machines, there are
some disadvantages to them as well. The major ones are software
compatibility, communications overhead, and equipment maintenance.

Software compatibility is the single largest problem in the minds of
data processing managers. Older production programs often have been
built with file or database structures other than the relational model,
and conversion is usually a tedious, expensive, and error-prone process.
Some data processing managers simply accept the incompatibility as a
necessary evil. They purchase database machines for their new
applications in the hope that the older, incompatible software will
become obsolete and "wither away" (13:87).

Communications overhead has long been a concern of database
theoreticians. Date felt that database machines were inherently limited
by the communications channels connecting the mainframe to the backend
machine (2:348-359). The system which best deals with this concern is
the Teradata DBC/1012, which uses a specially designed communications
network called a Ynet to eliminate all but the heaviest communications
traffic (12:48-50; 17:v, 7-6 - 7-8).

Finally, equipment maintenance can be a problem. Introducing a
database machine into a data processing facility adds more points of
failure, and on equipment dissimilar to that already installed.
Commercial database machine manufacturers are dealing with this problem
with a variety of approaches. They use reliable, off-the-shelf
components; processors are built to be fault-tolerant; and systems are
built in modules which can be repaired without shutting down an entire
system (12:53-54; 17:iv).


COMPARISON OF DATABASE MACHINE AGAINST
A CONVENTIONAL DATABASE MANAGEMENT SYSTEM


SELECTING  A  DATABASE  MACHINE  FOR  THE  PDS 

The current PDS databases occupy some 84 gigabytes of storage on 80
physical disk drives. The software using this data runs on four
Honeywell DPS-8/70 mainframe computers connected to these disk drives
(18:Aug 86 - Atch 1). Any database machine useful to the PDS must have
the disk capacity to hold the PDS databases and be hardware and software
compatible with the existing PDS mainframe computers. As the data in
Table 1 indicates, the Teradata DBC/1012 is the only commercially
available system which meets these basic criteria.

Relatively few DBC/1012s are installed today; almost all are at IBM
installations. However, the Honeywell Corporation is working with
Teradata Corporation to develop a connection between the Honeywell
GCOS-8 operating system (used by Honeywell mainframes) and the DBC/1012.
This capability is expected to be available sometime in 1987 (21:--).

Because of the small installed base of DBC/1012s and the absence of
commercial Honeywell users, performance data is somewhat limited. This
comparison will therefore evaluate the system on qualitative as well as
quantitative factors, in an attempt to examine all possible aspects of
system performance.


QUALITATIVE  FACTORS 

Reliability 

The DBC/1012 has several features which make it very reliable. Most
obvious is the "fallback" capability for "mirroring" valuable data. For
those parts of the database where fallback is specified by the database
administrator, all the data is duplicated on separate disk storage units
(DSUs) within the system. Should a DSU fail, all of its primary data
(it, too, contains mirrored data) would be unavailable, but the backup
copy of the data would be present, and would immediately be available
for use. This allows vital data and applications to operate virtually
uninterrupted. The penalty for this level of reliability is, of course,
that twice as much disk space is required for protected data.


FIGURE  4.  TERADATA  DBC/1012  CONFIGURATION 


In addition to the fallback protection, there are other features
which add to the system's reliability. The DBC is built almost entirely
with off-the-shelf components, such as the Intel 8086 processor and the
Winchester disk drive (12:53-54). All these parts are high-performance,
very reliable, and relatively inexpensive. The system is built in
modules, with the Ynet communications link serving as the only
connection between the various processor units (reference Figure 4).
This "loose connection" allows an ailing processor to be disconnected
from the system and repaired or replaced while the rest of the computer
continues to operate (12:54). The Ynet itself is a dual-channel
communications link, which can continue operating at a reduced rate even
if one channel should fail (17:7-6). In addition, the Ynet can furnish
data to the user in sorted order, independent of the ordering of the
data on the disk storage units (12:47-49).

Capacity

Besides reliability, the DBC/1012 provides excellent capacity, both
for the workload it is originally acquired for, and for future
expansion. The system can be configured to store up to 2.1 terabytes or
2150 gigabytes, the equivalent of 8600 bytes of data for every person in
the United States or 430 bytes of data for every person on Earth!
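
The per-person figures follow from simple division. The short check
below reproduces them; the population values of 250 million and 5
billion are assumptions back-solved from the quoted figures (the text
does not state them), and a gigabyte is taken as 10^9 bytes.

```python
CAPACITY_BYTES = 2150 * 10**9        # 2150 gigabytes, decimal units assumed
US_POPULATION = 250_000_000          # assumption, not stated in the text
WORLD_POPULATION = 5_000_000_000     # assumption, not stated in the text


def bytes_per_person(capacity_bytes, population):
    """Whole bytes of storage available per person."""
    return capacity_bytes // population


us_share = bytes_per_person(CAPACITY_BYTES, US_POPULATION)
world_share = bytes_per_person(CAPACITY_BYTES, WORLD_POPULATION)
```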

Because the disk space can be added in increments as small as 474
megabytes, it is relatively easy to build a system which is exactly the
right size for a given application. As processing or storage
requirements increase, processing units (IFPs and AMPs, reference Figure
4) can be added to maintain or improve performance, and the system will
automatically reorganize its databases for peak performance, without
complicated human intervention. Furthermore, IFPs can be added to the
system to connect it to multiple mainframe computers, allowing more
"front end" power to be added, more terminal users supported, and
different types of computers to share the same databases (19:--).

QUANTITATIVE  FACTORS 

Performance 

Besides the qualitative, "nice to have" features, the DBC/1012
offers a distinct performance advantage for large database applications.
The performance gains for the DBC/1012 vary according to the type of
application, but overall are very impressive. Batch update and
retrieval programs especially show significant gains. To estimate these
gains, the performance of the current Active Airman Master Personnel
File system will be compared with the projected performance using the
DBC/1012.

The Active Airman (AA) system uses a set of software called the
Generalized Update System (GUS). GUS is a database and software system
used for maintaining the master personnel files and other sets of
information with similar structure. It is designed to efficiently
maintain the data on a large number of similar individual records, for
example all Air Force officers, all Air National Guard airmen, or all
Air Force suggestions. At its heart is a batch program which processes
both update and retrieval transactions against a specific master
personnel file. It consists of a basic "host" program which reads and
writes the database. The host is linked to many transaction "modules,"
smaller subprograms which contain the logic for executing each separate
type of transaction. Several reports are generated from the outputs of
the batch update program, based on output transactions which are written
to files as part of the batch update process.

Once updated, the GUS databases are used for a wide variety of data
retrievals. Standardized, periodic report programs are run for many
users who have recurring needs for information. One-time reports can be
generated using the Air Force-developed ATLAS retrieval language, which
allows a personnel specialist to specify his information requirement and
report format, and run it as a scheduled production program. The ATLAS
retrieval capability accounts for some 53% of all computer use in the
headquarters-level PDS (18:--).

Typical performance times for GUS processes supporting the AA master
file are given in Table 2, along with estimated times for performing
similar functions using the DBC/1012. The DBC/1012 times were derived
from the vendor-supplied performance curves for transaction rates based
on the number of processors in the system, and assuming that the system
was configured with 6, 10, or 20 AMPs (reference Figure 5). As can be
seen, the DBC/1012, even in a minimal configuration, can readily support
the functions of the PDS.


Price 

Lightning performance is always desirable, but it does no good if it
is prohibitively expensive. Surprisingly, the DBC/1012 is a relatively
inexpensive system. A study by the Rome Air Development Center showed
that even small DBC/1012 systems compare favorably with conventional
general-purpose systems with similar capacities (17:7-14 - 7-15). For
a system large enough to support the PDS (20 gigabytes or larger), the
cost of the system is within an order of magnitude of the cost of disk
storage alone. The current storage required for the database portions
of the PDS, some 84 gigabytes, requires 80 MSU501 disk storage units,
costing approximately $50,000 each, or $47,620 per gigabyte of storage
(23:--). While the price of a Teradata system with this capacity is not
available, the largest system with a published price (4 IFPs, 20 AMPs,
and 40 DSUs) has a total system cost of $1,475,000, or about $71,600 per
gigabyte of storage (reference Table 3). While this cost is 50% greater
than the cost of conventional storage, it ignores the 16 million
instructions per second (MIPS) of processing power added to the system,
and the workload relieved from the existing mainframe.
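
The cost-per-gigabyte comparison can be reproduced directly from the
figures quoted above. In the check below, the 20.6 gigabyte capacity of
the largest priced system is taken from the price table; the function
and variable names are illustrative only.

```python
def cost_per_gigabyte(total_cost, gigabytes):
    """Total system cost divided by storage capacity."""
    return total_cost / gigabytes


# Conventional storage: 80 MSU501 units at about $50,000 each for 84 Gb.
conventional = cost_per_gigabyte(80 * 50_000, 84)

# Largest DBC/1012 with a published price: $1,475,000 for 20.6 Gb.
teradata = cost_per_gigabyte(1_475_000, 20.6)

# Fractional premium of the DBC/1012 over conventional storage.
premium = teradata / conventional - 1.0
```

Rounded, these give about $47,620 and $71,600 per gigabyte, a premium
of roughly 50%.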


FIGURE 5. TERADATA DBC/1012 PERFORMANCE CURVES (19:27)


Run times are (processor/elapsed) in hours.

                                      |          |      DBC/1012 estimates
process          | number of rows     | current  | 6 AMP     | 10 AMP    | 20 AMP
-----------------+--------------------+----------+-----------+-----------+----------
AA update        | 100,000 - 150,000  | 5.3/20.0 | 2.8/9.6   | 1.7/5.9   | 0.9/3.
ATLAS retrieval  | 540,000            | 2.       | 0.13/0.17 | 0.07/0.10 | 0.04/0.05
monthly extracts | 12,000,000         | 8.7/30.2 | 2.8/9.5 * | 1.6/5.6 * | 0.8/2.3

* assumes sorting by social security number is performed
by the DBC/1012 in parallel with other processing


TABLE 2. AA PERFORMANCE ESTIMATES (24:--)


IFPs | AMPs | DSUs | storage | processing   | price       | price/Gb
     |      |      |         | power (MIPS) |             |
-----+------+------+---------+--------------+-------------+----------
  2  |   2  |   4  |  2.1 Gb |      4.0     | $  335,000  | $159,500
  4  |   4  |   8  |  4.1    |      8.0     |    525,000  |  128,000
  4  |   8  |  16  |  8.2    |     12.0     |    770,000  |   93,900
  4  |  12  |  24  | 12.4    |     16.0     |    950,000  |   76,600
  8  |  20  |  40  | 20.6    |     28.0     |  1,475,000  |   71,600

TABLE 3. PRICES OF TERADATA DBC/1012 SYSTEMS (19:25)


CONCLUSIONS, FINDINGS AND RECOMMENDATIONS


INTRODUCTION 


Database machines, and the Teradata DBC/1012 in particular, have
several distinct advantages over conventional general-purpose computers
which recommend them. The most significant are their cost, capacity,
performance, and reliability.


CONCLUSIONS 


Cost 


Considering its capabilities, the DBC/1012 is a bargain. Current
disk storage, using Honeywell MSU501 disk drives, costs approximately
$50,000 per gigabyte. Disk storage using the DBC/1012 costs $71,600 per
gigabyte in the size range needed for the PDS. While this is half again
as much as the cost of MSU501 storage, it includes the inherent
processing power of the DBC/1012 (16 - 28 MIPS), and the Teradata DBMS.


Capacity


A relatively modest configuration of the DBC/1012 would provide
sufficient storage capacity for the entire PDS database, even with
complete data redundancy using the fallback method of duplicating data.
Both performance and storage capacity can be easily increased by adding
hardware modules (IFPs and AMPs), rather than making large-scale
modifications or replacements of mainframe computers. The added
processing power of the DBC relieves the mainframes of a considerable
workload, thereby extending their useful service life. Additionally,
performance of the DBC can be improved through hardware upgrades to the
DBC modules themselves, by upgrading the processors, adding memory to
the hardware modules, or by increasing the capacity of the disk units.
Finally, the DBC can be connected to several "front-end" mainframe
computers, making data sharing among various computers very easy.


FINDINGS 


Performance 


The DBC/1012 is one of the few systems which does not rapidly fall
victim to the law of "diminishing returns." In the range of systems
large enough to operate the PDS, the performance of the DBC increases
linearly as the system is expanded. This means that most foreseeable
performance improvements can be made simply by adding hardware.

Batch update performance is very sensitive to the number of AMPs
available in the system, and is severely constrained in small systems.
This is not surprising, considering that each transaction deals with a
single record, and once every AMP is working on a transaction, any
subsequent transaction will have to wait for the AMP processing its
"target" record to complete its transaction before proceeding. As long
as batch updates are performed "record-at-a-time," this will be a
major bottleneck for batch updates. Transactions which could be
"broadcast" to the entire database to update all applicable records
would greatly increase the performance of batch update programs. While
this is certainly a desirable situation, the real world may not be as
cooperative in making these "mass update" transactions the predominant
way of doing business.
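
The waiting effect described above can be illustrated with a toy model.
The sketch below is not the DBC/1012's actual scheduling or data
distribution method; it simply assumes that each record lives on exactly
one AMP (chosen here as key modulo the number of AMPs), that every
transaction costs one time unit, and that an AMP handles one transaction
at a time. All names are hypothetical.

```python
from collections import Counter


def elapsed_units(target_keys, n_amps):
    """Elapsed time for a batch of record-at-a-time transactions.

    AMPs work in parallel, but transactions aimed at records on the
    same AMP queue behind one another, so the batch finishes when the
    longest single-AMP queue has drained.
    """
    queue_lengths = Counter(key % n_amps for key in target_keys)
    return max(queue_lengths.values(), default=0)


spread = list(range(12))        # targets spread evenly across records
skewed = [0, 4, 8, 12] * 3      # with 4 AMPs, every target is on AMP 0

even_times = {n: elapsed_units(spread, n) for n in (2, 4, 6)}
skewed_time = elapsed_units(skewed, 4)
```

In this model an evenly spread batch finishes sooner as AMPs are added,
while a batch whose targets all sit on one AMP gains nothing from the
extra processors.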

Batch inquiry performance is the area where impressive gains can
readily be realized using the database machine. Row retrievals are very
fast, and the Ynet's sort capability is a processing bonus which is
conceivably as important as the database management facility. Because
of the great performance gains which can be realized from being
presented with sorted data by the database machine, any facility using
the DBC/1012 for large databases should procure the AMPs with the
maximum amount of sorting capacity available.


Reliability


The DBC/1012 is highly reliable for three basic reasons: it allows
complete redundancy in data storage, it is constructed of proven,
off-the-shelf components, and its configuration is highly modular. The
data redundancy and modularity are especially significant, since they
allow components of the system to be repaired while the rest of the
system continues operating. This combination prevents all but the most
catastrophic of failures (such as a total power outage) from putting the
system out of service.


Conversion 


Converting software to run on the DBC/1012 requires that the
software use relational databases. This is easiest in those systems
with simple data structures and small programs which are not tightly
bound to the structure of the data. It is most difficult where the data
is highly structured, the programs are large, and program logic is
tightly bound to the data structure. Unfortunately, the larger GUS
systems of the PDS fall into the latter category. To some extent, this
contributed to the difficulty experienced in converting these systems
during the REACQ project. This will be a problem in any future
conversion involving a departure from the current batch,
record-at-a-time processing concept.

RECOMMENDATIONS

The database machine offers capabilities that the PDS cannot afford
to do without. Operating on a database machine, the system will enjoy
high performance, easy capacity management and improvement, and high
reliability, all at a very reasonable cost. The problem is getting from
here to there: determining the actual system size and converting the
applications software are two major hurdles to cross.


System Sizing


Sizing a system is important for a simple reason: if the hardware
does not have sufficient storage or processing capacity for a set of
applications, no amount of software wizardry will make the system
perform. Fortunately, sizing the DBC/1012 system is relatively
straightforward. An initial estimate of system size can be made based
on the size of the database to be supported and the number of mainframe
computers to be connected to the database machine. These estimates can
then be used as "first guesses" that can be further refined using the
Rome Air Development Center model for performance estimation (17:vi).
Using this model should give a very close estimate of the necessary
system size.
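
A storage-driven "first guess" can be sketched from two figures given
earlier in this report: disk space is added in increments as small as
474 megabytes, and fallback protection doubles the space needed for
protected data. The sketch below covers only the storage side of the
estimate. It is not the Rome Air Development Center performance model,
it assumes 1000 megabytes per gigabyte, and the function and parameter
names are hypothetical.

```python
import math

INCREMENT_MB = 474  # smallest disk increment quoted for the DBC/1012


def first_guess_increments(data_gb, fallback_fraction=0.0, mb_per_gb=1000):
    """Number of 474 Mb increments needed to hold the database.

    fallback_fraction is the share of the data protected by fallback
    (0.0 = none, 1.0 = all); protected data is stored twice.
    """
    raw_mb = data_gb * mb_per_gb * (1.0 + fallback_fraction)
    return math.ceil(raw_mb / INCREMENT_MB)


no_fallback = first_guess_increments(84)          # 84 Gb PDS, unprotected
full_fallback = first_guess_increments(84, 1.0)   # every byte mirrored
```

For the 84 gigabyte PDS database this gives 178 increments without
fallback and 355 with complete fallback, figures that would then be
refined against processing requirements.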


Software  Conversion 


Software conversion was the major difficulty of the REACQ project,
and it could conceivably make installing a database machine impractical.
Because benchmarking is an integral part of selecting and acquiring a
hardware system, the Directorate of Personnel Data Systems needs to
begin prototype development of a relational database implementation as
soon as possible. The prototype effort should have three primary goals
in order to be useful: concept development, software methodology, and
benchmarking.


Concept development is simply deciding on basic overall processing
strategies, features to take advantage of, and pitfalls to avoid. It
should provide the basic "road map" for the prototype developers to keep
their efforts consistent with the mission and functions of the PDS and
to avoid duplication or false starts. With a well-thought-out concept,
the software for the prototype can be developed.


The software methodology is important because it will determine how
production software will be developed in an actual database machine
environment. This can be begun on the existing Honeywell DPS-8/70
system using the installed Personal Data Query (PDQ) system. This part
of the prototype effort will validate the prototype concept and provide
necessary "lessons learned" well before the actual database machine is
ever used.

The benchmark is the "moment of truth" for selection of the database
machine. With a prototype developed using PDQ, conversion of the
software to run on the DBC/1012 should be minimal. The benchmark can be
compared to the actual production software currently in use, and to the
prototype software as it was run on the mainframe system. Given a
significant performance improvement using the benchmark trial, the
DBC/1012 should be purchased for installation as an integral part of the
PDS.